⚡ Cache Optimization - abnv · Scour

Why I’m Building a Database Engine in C# 🗃️Query Compilation

nockawa.github.io·5d·Hacker News·

Stack vs malloc: real-world benchmark shows 2–6x difference 📚Stack Data Structures

blog.stackademic.com·1d·DEV·

Supercharging Redpanda Streaming with profile-guided optimization 📈Performance Tools

redpanda.com·22h·

Automated Multiphysics For Successful 3D-IC Design ⚡Instruction Fusion

semiengineering.com·15h·

Donald Raab: Measuring the Startup Memory Cost for Lazy Iteration Patterns in Java 🗑️Garbage Collection

donraab.medium.com·2d·

Designing High-Concurrency Databricks Workloads Without Performance Degradation 🗑️Concurrent GC

dzone.com·6d·

Net2Tab: Tabularizing Neural Networks with Applications to Data Prefetching 🗺️Region Inference

sciencedirect.com·1d·

Metal Quantized Attention: pulling M5 Max ahead with Int8 matrix multiplication 🗺️Region Inference

releases.drawthings.ai·1d·Hacker News·

Iteratively optimizing an SPSC queue 🎯Ring Buffers

blog.c21-mac.com·4d·r/cpp·

Intel Announces The "Optimization Zone" ⚡Instruction Fusion

phoronix.com·2d·

Node.js Caching Strategies in Production: In-Memory, Redis, and CDN 🔗Weak References

axiom-experiment.hashnode.dev·6d·DEV·

[Benchmark] 740k QPS Single-thread / 1.45M Dual-thread on a VM. Encountering fluctuations and seeking expert analysis. 🌐WASM Runtimes

github.com·1d·r/java·

The Infrastructure Engineer’s Guide to Benchmarking Price-Performance Gains from Google Cloud VM… ⚡Performance

medium.com·1d·

MXFP8 GEMM: Up to 99% of cuBLAS Performance Using CUDA and PTX 🔬Nanopasses

danielvegamyhre.github.io·4d·Hacker News·

'Performance without compromise': AMD debuts first dual 3D V-Cache Ryzen CPU in potential showdown against Threadripper and EPYC siblings 🎯CPU Dispatch

techradar.com·2d·

Beating Python’s GIL: Achieving a 130x Speedup in Batch Processing with Rust and Rayon 🦀MIR Optimization

medium.com·2d·

Finding performance bottlenecks with Pyroscope and Alloy: An example using TON blockchain 🔗Hash Algorithms

grafana.com·3d·

JetStream 3: A modern benchmark for high-performance, compute-intensive Web applications ⚡Performance

blog.chromium.org·2d·Hacker News, Blogger·

Stop obsessing over your GPU's core clock — memory clock matters more for local LLM inference 🗺️Region Inference

xda-developers.com·5d·

Boost Training Goodput: How Continuous Checkpointing Optimizes Reliability in Orbax and MaxText 🏰Capability Machines

developers.googleblog.com·2d·
